Trading Consequences: A Case Study of Combining Text Mining & Visualisation to Facilitate Document Exploration
Abstract
Trading Consequences is an interdisciplinary research project between historians, computational linguists and visualisation specialists. We use text mining and visualisations to explore the growth of the global commodity trade in the nineteenth century. Feedback from a group of environmental historians during a workshop provided essential information to adapt advanced text mining and visualisation techniques to historical research. Expert feedback is an essential tool for effective interdisciplinary research in the digital humanities.
Similar resources
Trading Consequences: A Case Study of Combining Text Mining and Visualization to Facilitate Document Exploration
Large-scale digitization efforts and the availability of computational methods, including text mining and information visualization, have enabled new approaches to historical research. However, we lack case studies of how these methods can be applied in practice and what their potential impact may be. Trading Consequences is an interdisciplinary research project between environmental historians...
Bootstrapping a historical commodities lexicon with SKOS and DBpedia
Named entity recognition for novel domains can be challenging in the absence of suitable training materials for machine-learning or lexicons and gazetteers for term look-up. We describe an approach that starts from a small, manually created word list of commodities traded in the nineteenth century, and then uses semantic web techniques to augment the list by an order of magnitude, drawing on da...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
Digitised historical text: Does it have to be mediOCRe?
This paper reports on experiments to improve the Optical Character Recognition (OCR) quality of historical text as a preliminary step in text mining. We analyse the quality of OCRed text compared to a gold standard and show how it can be improved by performing two automatic correction steps. We also demonstrate the impact this can have on named entity recognition in a preliminary extrinsic eval...
Geostatistical Approaches for Geovisual Data Exploration, Analysis and 3D-Visualisation in Civil Security
This contribution presents selected approaches, methods and tools to facilitate geovisual analytical data exploration for civil security purposes. To analyse large emergency service data of a major German city’s fire department, different data mining techniques are applied. This allows statistically significant clusters in space and time to be identified. To facilitate convenient methods for exploring...